Packages Used

{tidyverse},{dplyr},{mosaic},{tidylog},{readxl},{ggplot2},{formatR},{knitr},{kableExtra},{cowplot},{lme4},{lmerTest},{Matrix}

knitr::opts_chunk$set(fig.path = "images/") #Output folder for all of my figures

library(tidyverse)
library(dplyr)
library(mosaic)
library(tidylog)
library(readxl)
library(ggplot2)
library(formatR)
library(knitr)
library(kableExtra)
library(cowplot)
library(lme4)
library(lmerTest)
library(Matrix) 
#NOTE: The installed versions of {Matrix} and {lme4} must be compatible binaries.
#I resolved this by uninstalling {Matrix}, {lmer4}, and {lmerTest}, and then reinstalling each following the cited *Stack Overflow* page's method:

#oo <- options(repos = "https://cran.r-project.org/")
#install.packages("Matrix")
#install.packages("lme4")
#install.packages("lmerTest")
#options(oo)

#I am commenting this here as opposed to including the above in my code to avoid having a package installation in my code
#Personally, I have {Matrix} v.1.6-5, {lme4} v1.1-35.2, and {lmerTest} v3.1-3 installed.

Introduction to Dunham et al. (2020a)

Summary

In an earlier study of symmetrical gait kinematics in 11 species of free-ranging platyrrhines, Dunham et al. (2019) founds that Saguinus tripartitus (golden-mantled tamarins) used primarily asymmetrical gaits and Cebuella pygmaea (pygmy marmosets) used exclusively asymmetrical gaits. For this reason, Dunham et al. (2019) did not include these species in their original analysis but returned to these samples in their paper, “Asymmetrical gait kinematics of free-ranging callitrichine primates in response to changes in substrate diameter and orientation” (2020a). Specifically, Dunham et al. (2020a) were interested in how the spatiotemporal kinematics of S. tripartitus and C. pygmaea gait are affected by the diameter and orientation of substrate, as well as how these kinematics impact substrate displacement during locomotion.

Video recordings of locomotor bouts were opportunistically recorded for free-ranging S. tripartitus and C. pygmaea at the Tiputini Biodiversity Station in Amazonian Ecuador (Dunham et al., 2020a). Substrate orientation was measured using a laser range finder; substrate orientation and displacement were later quantified using pixel lengths in the video recordings (Dunham et al., 2020a). Spatiotemporal gait kinematics were assessed in each video using the GaitKeeper package for MATLAB (Dunham et al., 2020a).

Statistical Methods

X^2 Tests

Dunham et al. (2020a) conducted X^2 tests to determine the frequencies of different gait types (asymmetrical vs. symmetrical) in relation to the diameter and orientation of a substrate.

Linear Mixed Models

Dunham et al. (2020a) used linear mixed models to evaluate whether substrate diameter and substrate orientation had a significant effect on each of the spatiotemporal kinematic variables in their dataset, as well as whether there was significant interaction between these fixed effects.

Dunham et al. (2020a) also used linear mixed models to evaluate whether each spatiotemporal kinematic variable in their dataset had a significant effect on substrate displacement, as well as whether there was significant interaction between these fixed effects.

Binary Logistic Mixed-effects Models

In their evaluation of whether substrate diameter and orientation had significant effects on the kinematic variables in their dataset, binary logistic mixed-effects regression was used in cases where the model did not satisfy the assumptions necessary for linear mixed model regression.

Results

Dunham et al. (2020a) found that both S. tripartitus and C. pygmaea used primarily asymmetrical gaits, with some symmetrical used at walking speeds.

Effect of substrate diameter on gait type and gait kinematics

Dunham et al. (2020a) found that substrate diameter had a significant effect on gait type in S. tripartitus and C. pygmaea. S. tripartitus individuals adjusted to more symmetrical gaits or gallops on small diameter substrates, with more half-bounds and bounds used on medium and large diameter substrates (Dunham et al., 2020a). C. pygmaea individuals also predominantly used half-bounds and bounds on medium and large diameter substrates (Dunham et al., 2020a). Additionally, substrate diameter had a significant effect on all kinematic variables measured in S. tripartitus but did not significantly effect the gait kinematics of C. pygmaea (Dunham et al., 2020a).

Effect of substrate orientation on gait type and gait kinematics

Dunham et al. (2020a) found that substrate orientation did not have a significant effect on gait type in S. tripartitus but did have a significant effect on gait type in C. pygmaea. C. pygmaea used significantly more bounding gaits on inclined substrate, and significantly more half-bounds on horizontal and declined substrates (Dunham et al., 2020a). Additionally, the cosine of substrate orientation had a significant effect on gait kinematics in both C. pygmaea and S. tripartitus, while the sine of substrate orientation only had a significant effect on gait kinematics in C. pygmaea (Dunham et al., 2020a).

Effect of relative speed on gait kinematics

Dunham et al. (2020a) found that relative speed significantly affected gait kinematics in both S. tripartitus and C. pygmaea.

Effect of gait kinematics on substrate displacement

Dunham et al. (2020a) found that gait kinematics did not have a significant effect on substrate displacement for either S. tripartitus or C. pygmaea (Dunham et al., 2020a).

Conclusion

Overall, Dunham et al. (2020a) concluded that S. tripartitus and C. pygmaea, two species of small-bodied callitrichines, use predominantly asymmetrical gaits and adjust the spatiotemporal kinematics of their gait in response to changes in substrate, a finding that informs future research into reconstructing the ancestral primate condition for gait and the arboreal niche that early primate locomotor evolution may have occurred within.

My Project Outline

Descriptive Statistics

For my descriptive statistics, I will be attempting to replicate Dunham et al. (2020a) Table 1, which summarizes the “number of strides for different substrate diameter and orientation categories in free-ranging S. tripartitus and C. pygmaea” (p. 4).

Inferential Statistics

For my inferential statistical analysis, I will be attempting to replicate the linear mixed regression analysis that Dunham et al. (2020a) used to evaluate the “effects of substrate diameter and substrate orientation on continuous kinematic variables,” including relative speed, mean duty factor, mean number of supporting limbs, relative forelimb lead duration, relative hindlimb lead duration, and percent aerial phase (p. 4). The results of this anaylsis are summarized in Dunham et al. (2020a) Table 2 (p. 6). Dunham et al. (2020a) also used binary logistic mixed-effects models for two models because the residual error for these two models was not normally distributed, meaning that they did not “satisfy assumptions of linear mixed regression analysis” (p. 4). Thus, for these models, I will be attempting to replicate the binary logistic mixed-effects regression analysis used by Dunham et al. (2020a).

Visualization

For my visualization, I will be attempting to replicate Dunham et al. (2020a) Figure 3 (p. 7), which consists of plots of various kinematic variables (mean duty factor, mean number of supporting limbs, relative forelimb lead duration, relative hindlimb lead duration, and percent aerial phase) regressed against relative speed for S. tripartitus and C. pygmaea.

Exploratory Data Analysis

Here, I’m loading the dataset from Dunham et al. (2020).

f <- "data/Dunham et al. (2020) dataset.xlsx"
d <- read_excel(f,sheet = 1, col_names = TRUE)
(head(d)) #For a view of the first few lines of the data set
## # A tibble: 6 × 14
##   Genus    relBrnchDiamNsTl Sin_SbstOrient   Cos_SbstOrient `Gait Type` relSpeed
##   <chr>               <dbl> <chr>            <chr>          <chr>          <dbl>
## 1 Saguinus            0.272 0.5              0.86602540400… Asym            5.40
## 2 Saguinus            0.360 0.5              0.86602540400… Asym            9.50
## 3 Saguinus            0.123 0.9128341770000… 0.40833046000… Asym            6.30
## 4 Saguinus            2.37  0.9563047559999… 0.29237170499… Asym            5.40
## 5 Saguinus            5.82  0.9990482219999… 4.36193870000… Asym            9.96
## 6 Saguinus            0.533 NA               NA             Asym            6.20
## # ℹ 8 more variables: Meandf <dbl>, MeanNSL <dbl>, relFLldMS <chr>,
## #   relHLldMS <chr>, PercAerial <dbl>, MeanLphsMS <chr>, SubstrateDisp <chr>,
## #   MovClip <chr>

Here are the definitions presented by Dunham et al. (2020b) for the variables in their published dataset:

variablematrix <- matrix(c("relBrnchDiamNsTl","Sin_SbstOrient","Cos_SbstOrient","Gait Type","relSpeed","Meandf","MeanNSL","relFLldMS","relHLldMS","PercAerial","MeanLphsMS","SubstrateDisp","MovClip","relative branch diameter","sine of substrate orientation angle","cosine of substrate orientation angle","asymmetrical gait vs. symmetrical gait","relative speed","mean duty factor","mean number of supporting limbs","relative forelimb lead duration","relative hindlimb lead duration","percent aerial phase","mean limb phase at midstance","substrate displacement","name of video clip"), ncol = 2, nrow = 13, byrow = FALSE)
(variablematrix)
##       [,1]               [,2]                                    
##  [1,] "relBrnchDiamNsTl" "relative branch diameter"              
##  [2,] "Sin_SbstOrient"   "sine of substrate orientation angle"   
##  [3,] "Cos_SbstOrient"   "cosine of substrate orientation angle" 
##  [4,] "Gait Type"        "asymmetrical gait vs. symmetrical gait"
##  [5,] "relSpeed"         "relative speed"                        
##  [6,] "Meandf"           "mean duty factor"                      
##  [7,] "MeanNSL"          "mean number of supporting limbs"       
##  [8,] "relFLldMS"        "relative forelimb lead duration"       
##  [9,] "relHLldMS"        "relative hindlimb lead duration"       
## [10,] "PercAerial"       "percent aerial phase"                  
## [11,] "MeanLphsMS"       "mean limb phase at midstance"          
## [12,] "SubstrateDisp"    "substrate displacement"                
## [13,] "MovClip"          "name of video clip"

Some variables that should be numeric were read in as character variables, so I needed to correct this.

class(d$relBrnchDiamNsTl) #Checking; relBrnchDiamNsTl is already numeric
## [1] "numeric"
class(d$Sin_SbstOrient) #Checking
## [1] "character"
d$Sin_SbstOrient <- as.numeric(d$Sin_SbstOrient) #Changing Sin_SbstOrient to numeric
class(d$Cos_SbstOrient) #Checking
## [1] "character"
d$Cos_SbstOrient <- as.numeric(d$Cos_SbstOrient) #Changing Cos_SbstOrient to numeric
class(d$relSpeed)#Checking; relSpeed is already numeric
## [1] "numeric"
class(d$Meandf)#Checking; Meandf is already numeric
## [1] "numeric"
class(d$MeanNSL) #Checking; MeanNSL is already numeric
## [1] "numeric"
class(d$relFLldMS) #Checking
## [1] "character"
d$relFLldMS <- as.numeric(d$relFLldMS) #Changing relFLldMS to numeric
class(d$relHLldMS) #Checking
## [1] "character"
d$relHLldMS <- as.numeric(d$relHLldMS) #Changing relHLldMS to numeric
class(d$PercAerial) #Checking; PercAerial is already numeric
## [1] "numeric"
class(d$MeanLphsMS) #Checking
## [1] "character"
d$MeanLphsMS <- as.numeric(d$MeanLphsMS) #Changing MeanLphsMS to numeric
class(d$SubstrateDisp) #Checking
## [1] "character"
d$SubstrateDisp <- as.numeric(d$SubstrateDisp) #Changing SubstrateDisp to numeric

Various kinematic and substrate variables and were calculated by Dunham et al. (2020) from the data available in their published dataset (pp. 3-4).

The variable relative branch diameter (relBrnchDiamNsTl) is a ratio of the branch diameter relative to primate body length measure from nose to base of tail (Dunham et al., 2020a). Dunham et al. (2020a) used these relative branch diameter ratios to group substrate into three size categories: small (“less than half of an individual’s trunk diameter”), medium (“between half and equal to an individual’s trunk diameter”), and large (“greater than an individual’s trunk diameter”) (p. 4). Here, I am adding a new variable to my dataset, relBrnchDiamCategories, to categorize substrate for each observation as described above.

Here are the levels in relBrnchDiamCategories:

d <- d |> mutate(relBrnchDiamCategories = case_when( #Creating new variable categorizing relative substrate diameter
  relBrnchDiamNsTl >= 0.5 & relBrnchDiamNsTl <= 1 ~ "Medium",
  relBrnchDiamNsTl > 1 ~ "Large",
  relBrnchDiamNsTl < 0.5 ~ "Small",
  TRUE ~ NA
  ))
## mutate: new variable 'relBrnchDiamCategories' (character) with 3 unique values and 0% NA
d$relBrnchDiamCategories <- as.factor(d$relBrnchDiamCategories) #As factor because this is a categorical variable with levels.
(levels(d$relBrnchDiamCategories))
## [1] "Large"  "Medium" "Small"

The variables sine of substrate orientation angle (Sin_SbstOrient) and cosine of substrate orientation angle (Cos_SbstOrient) are transformations of the raw substrate orientation angle measurement in degrees (Dunham et al., 2020a, p. 4). Dunham et al. (2020a) used these to group substrate into three orientation categories: horizontal (“between -30 and 30 deg”), decline (“less than -30 deg”), and incline (“greater than 30 deg”) (p. 4). Here, I am adding a new variable to my dataset, SbstOrientCategories, to categorize substrate for each observation as described above. The choice to use Sin_SbstOrient to calculate orientation angle is because, unlike Cos_SbstOrient, it can differentiate declines and inclines (Dunham et al., 2020a).

Here are the levels in SbstOrientCategories:

d$Angle_SbstOrient <- (asin(d$Sin_SbstOrient))*(180/pi) #Converting sin to angle (deg)

d <- d |> mutate(SbstOrientCategories = case_when( #Creating new variable categorizing substrate orientation
  Angle_SbstOrient >= -30 & Angle_SbstOrient <= 30 ~ "Horizontal",
  Angle_SbstOrient < -30 ~ "Decline",
  Angle_SbstOrient > 30 ~ "Incline",
  TRUE ~ NA
  ))
## mutate: new variable 'SbstOrientCategories' (character) with 4 unique values and 24% NA
d$SbstOrientCategories <- as.factor(d$SbstOrientCategories) #As factor because this is a categorical variable with levels.
d$SbstOrientCategories <- fct_relevel(d$SbstOrientCategories,"Horizontal","Incline","Decline") #Reordering the levels to later better replicate Table 1
(levels(d$SbstOrientCategories))
## [1] "Horizontal" "Incline"    "Decline"

Because only Genus in included as a column in the published dataset, I decided to add columns both for species (Species) and binomial nomenclature (Binomial).

d$Genus <- as.factor(d$Genus)

d <- d |> mutate(Species = case_when( #Creating new variable, Species
  Genus == "Cebuella" ~ "pygmaea",
  Genus == "Saguinus" ~ "tripartitus"
  ))
## mutate: new variable 'Species' (character) with 2 unique values and 0% NA
d <- d |> mutate(Binomial = paste(Genus, Species, sep = " ")) #Creating new variable, Binomial, by concatenating **Genus** and **Species** using paste()
## mutate: new variable 'Binomial' (character) with 2 unique values and 0% NA
d$Binomial <- as.factor(d$Binomial)
d$Binomial <- fct_relevel(d$Binomial,"Saguinus tripartitus","Cebuella pygmaea")

Dunham et al. (2020a) define the variables relative forelimb lead duration (relFLldMS) and relative hindlimb lead duration (relHLldMS) represent “the interval between touchdowns within a limb girdle divided by the total contact duration of the limb girdle” (p. 4). Dunham et al. (2020a) used these to differentiate symmetrical gaits (“temporal lag between the left and right limbs in each girdle amounted to 50±10% of stride duration”) from asymmetrical gaits (values outside the symmetrical gait range) (p. 3). Within asymmetrical gaits, these variables were additionally used to differentiate bounds (both forelimb and hindlimb stance periods were nearly simultaneous), half-bounds (hindlimb stance periods were nearly simultaneous but forelimb stance periods were staggered), and gallops (both forelimb and hindlimb stance periods were staggered) (p. 3).

Here, I am first adding two new columns to my dataset, FL and HL, to store whether forelimb touchdowns and hindlimb touchdowns, respectively, were simultaneous (“the interval between the trailing and leading limb mid-support was ≤10% of total limb pair contact duration”) or staggered (“the interval between the trailing and leading limb mid-support was ≥10% of total limb pair contact duration”) for each observation (Dunham et al., 2020a, p. 3). Following from this, I am adding a new variable, GaitType2, which categorizes each observation as a symmetrical gait, a bound, a half-bound, or a gallop (p. 3).

d <- d |> #Categorizing forelimb stance periods as nearly simultaneous or staggered based on relative forelimb lead duration, relFLldMS
  mutate(FL = case_when(
    relFLldMS <= 0.1 ~ "simultaneous",
    relFLldMS > 0.1 ~ "staggered"
  ))
## mutate: new variable 'FL' (character) with 3 unique values and 5% NA
d <- d |> #Categorizing hindlimb stance periods as nearly simultaneous or staggered based on relative hindlimb lead duration, relHLldMS
  mutate(HL = case_when(
    relHLldMS <= 0.1 ~ "simultaneous",
    relHLldMS > 0.1 ~ "staggered"
  ))
## mutate: new variable 'HL' (character) with 3 unique values and 5% NA
d <- d |> mutate(GaitType2 = case_when( #Categorizing gait types based on both forelimb and hindlimb stance periods
  `Gait Type` == "Sym" ~ "Symmetrical",
  FL == "simultaneous" & HL == "simultaneous" ~ "Bound",
  FL == "staggered" & HL == "simultaneous" ~ "Half-bound",
  FL == "staggered" & HL == "staggered" ~ "Gallop"
))
## mutate: new variable 'GaitType2' (character) with 5 unique values and 1% NA
gait <- table(d$Binomial,d$GaitType2)
gaittable <- kable(gait, caption = "Number of strides for different gait types in free-ranging <i>Saguinus tripartitus</i> and <i>Cebuella pygmaea</i>")|> #Adding a title
        add_header_above(c(" " = 1, "Gait Type" = 4))|> #Adding subtitles over the specified groups of columns
        kable_styling(full_width = FALSE) |> #Fixing the spacing of the columns, which were crowded together without this specification
        row_spec(0, extra_css = "margin-bottom: 10px;") |> #Fixing the spacing of the rows, which were crowded together without this specification
        column_spec(1, italic = TRUE) #Italicizing the scientific names pulled from the variable Binomial

(gaittable)
Number of strides for different gait types in free-ranging Saguinus tripartitus and Cebuella pygmaea
Gait Type
Bound Gallop Half-bound Symmetrical
Saguinus tripartitus 19 58 17 11
Cebuella pygmaea 94 0 5 2

Descriptive Statistics

Here, I attempting to replicate Table 1, which summarizes the number of strides (n=209) for different substrate diameter (small, medium, large) and orientation categories (decline, horizontal, incline) in free-ranging S. tripartitus (n=108) and C. pygmaea (n=101) (p. 4). Dunham et al. (2020a) clarify that each observation in the dataset represents one stride (n = 209) (p. 3).

tab1a <- table(d$Binomial,d$relBrnchDiamCategories) #Creating a table summarizing the frequency of each substrate diameter category across observations of Saguinus and Cebuella strides
tab1b <- table(d$Binomial,d$SbstOrientCategories) #Creating a table summarizing the frequency of each substrate orientation category across observations of Saguinus and Cebuella strides
tab1 <- cbind(tab1a,tab1b) #Combining the above two tables to more accurately replicate Table 1
tab1 <- kable(tab1, caption = "Table 1. Number of strides for different substrate diameter and orientation categories in free-ranging <i>Saguinus tripartitus</i> and <i>Cebuella pygmaea</i>")|> #Adding a title
        add_header_above(c(" " = 1, "Substrate diameter" = 3, "Substrate orientation" = 3))|> #Adding subtitles over the specified groups of columns
        kable_styling(full_width = FALSE) |> #Fixing the spacing of the columns, which were crowded together without this specification
        row_spec(0, extra_css = "margin-bottom: 10px;") |> #Fixing the spacing of the rows, which were crowded together without this specification
        column_spec(1, italic = TRUE) #Italicizing the scientific names pulled from the variable Binomial
(tab1)
Table 1. Number of strides for different substrate diameter and orientation categories in free-ranging Saguinus tripartitus and Cebuella pygmaea
Substrate diameter
Substrate orientation
Large Medium Small Horizontal Incline Decline
Saguinus tripartitus 3 7 98 32 18 7
Cebuella pygmaea 39 41 21 7 73 21

Here is Table 1 for comparison (Dunham et al., 2020a, p.4):

My values for the number of strides observed for each substrate orientation category match exactly. However, looking instead at substrate diameter, while the stride numbers sum to the correct number of observations for S. tripartitus (n=108) and C. pygmaea (n=101), my distribution of strides across substrate diameter categories does not match Table 1 (Dunham et al., 2020a). Dunham et al. (2020a) state that the variable relative branch diameter (relBrnchDiamNsTl) is a ratio of the branch diameter relative to primate body length measure from nose to base of tail (p. 4). Because a small substrate was defined as one “less than half of an individual’s trunk diameter,” I assigned this category to observations where relBrnchDiamNst < 0.5 (Dunham et al., 2020a, p. 4). Because a medium substrate was defined as one “between half and equal to an individual’s trunk diameter,” I assigned this category to observations where 0.5 >= relBrnchDiamNst <= 1 (Dunham et al., 2020a, p. 4). Lastly, because a large substrate was defined as one “between half and equal to an individual’s trunk diameter,” I assigned this category to observations where relBrnchDiamNst > 1 (Dunham et al., 2020a, p. 4). Therefore, while I believe my assignment of these categories matches their presented definitions, there appears to be some discrepancy between how I assigned these categories and how they were assigned by Dunham et al. (2020a). One potential explanation is that relBrnchDiamNst in the published dataset is likely a Box–Cox transformed value, as other variables in the published dataset clearly are already transformed according to the methods outlined by Dunham et al.’s (2020a) in the “Statistical analyses” section (p. 4). I unfortunately cannot revert these values back to their raw values to test if this is the issue because the lambda value used as the transformation parameter is not specified.

Inferential Statistical Analysis

Linear Mixed Models

Dunham et al. (2020a) ran linear mixed model regressions in order to evaluate the effects of substrate diameter and orientation on various kinematic variables of the gait of S. tripartitus and C. pygmaea individuals, such as:

  • Relative speed
  • Mean duty factor
  • Mean number of supporting limbs
  • Relative forelimb lead duration
  • Relative hindlimb lead duration
  • Percent aerial phase.

Raw measures of substrate orientation angle were transformed into [1] the sine of the substrate orientation angle, and [2] the cosine of the substrate orientation angle. The sine value allowed the authors to differentiate declines and inclines in addition to horizontal substrates, while the cosine value allowed the authors to more generally look at horizontal substrates and oblique (declined or inclined) substrates.

The packages {lme4} and {lmerTest} were used to carry out their linear mixed models in R in order to utilize Sattherthwaite approximations to adjust degrees of freedom (Dunham et al., 2020a).

In {lme4}, the function lmer() is used to carry out a linear mixed model regression. The general syntax of the function lmer() is:

lmer(response_variable ~ fixed_effects + (random_effects | group_variable))

For example, if I am looking to evaluate the effect of substrate diameter and substrate orientation on mean duty factor…

  • Mean duty factor (MeandF) is my response variable
  • My fixed effects are substrate diameter (relBrnchDiamNsTl) and substrate orientation (Sin_SbstOrient and Cos_SbstOrient)
  • Dunham et al. (2020a) also included relative speed (relSpeed) as a covariate for each kinematic variables to account for the potential effects of speed (p. 4), so this would be added to the fixed effects argument.
  • Dunham et al. (2020a) specified that “individual primate was nested within video clip as a random factor (intercept) in each model” because some video clips (MovClip) contained multiple individuals (p. 4).

All together, this code would look like:

lmer(Meandf ~ relSpeed + relBrnchDiamNsT1 + Sin_SbstOrient + Cos_SbstOrient + (1|MovClip))

Effect of substrate diameter and substrate orientation on relative speed for S. tripartitus

d$MovClip <- as.factor(d$MovClip)

saguinus <- d |> filter(Genus == "Saguinus")
## filter: removed 101 rows (48%), 108 rows remaining
m1s <- lmer(relSpeed ~ relBrnchDiamNsTl + Sin_SbstOrient + Cos_SbstOrient + (1|MovClip), data = saguinus, na.action = na.omit)
(anova(m1s))
## Type III Analysis of Variance Table with Satterthwaite's method
##                   Sum Sq Mean Sq NumDF  DenDF F value  Pr(>F)  
## relBrnchDiamNsTl 2.14587 2.14587     1 52.019  6.5525 0.01342 *
## Sin_SbstOrient   0.02589 0.02589     1 51.734  0.0791 0.77970  
## Cos_SbstOrient   1.73891 1.73891     1 51.818  5.3098 0.02525 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Highlighted below for comparison are the relevant results presented in Dunham et al. (2020a) Table 2 (p. 6):

While the F-values, degrees of freedom, and p-values are not exact matches, in both my replication and the original model there is a significant effect of substrate diameter and cosine of substrate orientation on relative speed, while there is not a significant correlation between sine of substrate orientation and relative speed.

Effect of substrate diameter and substrate orientation on relative speed for C. pygmaea

cebuella <- d |> filter(Genus == "Cebuella")
## filter: removed 108 rows (52%), 101 rows remaining
m1c <- lmer(relSpeed ~ relBrnchDiamNsTl + Sin_SbstOrient + Cos_SbstOrient + (1|MovClip), data = cebuella, na.action = na.omit)
(anova(m1c))
## Type III Analysis of Variance Table with Satterthwaite's method
##                  Sum Sq Mean Sq NumDF  DenDF F value   Pr(>F)   
## relBrnchDiamNsTl  3.459   3.459     1 69.601  0.6103 0.437317   
## Sin_SbstOrient    7.894   7.894     1 55.276  1.3926 0.243018   
## Cos_SbstOrient   49.209  49.209     1 63.925  8.6817 0.004481 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Highlighted below for comparison are the relevant results presented in Dunham et al. (2020a) Table 2 (p. 6):

While the F-values, degrees of freedom, and p-values are not exact matches, in both my replication and the original model there is a significant correction between cosine of substrate orientation and relative speed, while substrate diameter and sine of substrate orientation do not have a significant effect on relative speed.

Effect of substrate diameter and substrate orientation on mean duty factor for S. tripartitus

#For Saguinus, effect of substrate diameter and substrate orientation on mean duty factor
m2s <- lmer(Meandf ~ relSpeed + relBrnchDiamNsTl + Sin_SbstOrient + Cos_SbstOrient + (1|MovClip), data = saguinus, na.action = na.omit)
(anova(m2s))
## Type III Analysis of Variance Table with Satterthwaite's method
##                    Sum Sq  Mean Sq NumDF  DenDF F value    Pr(>F)    
## relSpeed         0.040549 0.040549     1 51.249 26.0469 4.933e-06 ***
## relBrnchDiamNsTl 0.003152 0.003152     1 50.820  2.0250   0.16084    
## Sin_SbstOrient   0.001442 0.001442     1 47.671  0.9264   0.34065    
## Cos_SbstOrient   0.006897 0.006897     1 48.728  4.4302   0.04049 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Highlighted below for comparison are the relevant results presented in Dunham et al. (2020a) Table 2 (p. 6):

While the F-values, degrees of freedom, and p-values are not exact matches, in both my replication and the original model relative speed and cosine of substrate orientation each have a significant effect on mean duty factor, while sine of substrate orientation does not have a significant effect on mean duty factor. However, my replication and the original model differ in that mine does not support a significant correlation between substrate diameter and mean duty factor while the original model did find this to be significant.

Effect of substrate diameter and substrate orientation on mean duty factor for C. pygmaea

#For Cebuella, effect of substrate diameter and substrate orientation on mean duty factor
m2c <- lmer(Meandf ~ relSpeed + relBrnchDiamNsTl + Sin_SbstOrient + Cos_SbstOrient + (1|MovClip), data = cebuella, na.action = na.omit)
(anova(m2c))
## Type III Analysis of Variance Table with Satterthwaite's method
##                    Sum Sq  Mean Sq NumDF  DenDF F value    Pr(>F)    
## relSpeed         0.289702 0.289702     1 91.184 74.9226 1.633e-13 ***
## relBrnchDiamNsTl 0.000115 0.000115     1 70.420  0.0298  0.863506    
## Sin_SbstOrient   0.040149 0.040149     1 55.416 10.3833  0.002131 ** 
## Cos_SbstOrient   0.000121 0.000121     1 69.675  0.0314  0.859951    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Highlighted below for comparison are the relevant results presented in Dunham et al. (2020a) Table 2 (p. 6):

While the F-values, degrees of freedom, and p-values are not exact matches, in both my replication and the original model relative speed and sine of substrate orientation each have a significant effect on mean duty factor, while substrate diameter and cosine of substrate orientation do not have a significant effect on mean duty factor.

Effect of substrate diameter and substrate orientation on mean number of supporting limbs for S. tripartitus

m3s <- lmer(MeanNSL ~ relSpeed + relBrnchDiamNsTl + Sin_SbstOrient + Cos_SbstOrient + (1|MovClip), data = saguinus, na.action = na.omit)
(anova(m3s))
## Type III Analysis of Variance Table with Satterthwaite's method
##                     Sum Sq   Mean Sq NumDF  DenDF F value    Pr(>F)    
## relSpeed         0.0222042 0.0222042     1 44.165 18.2787 0.0001004 ***
## relBrnchDiamNsTl 0.0007352 0.0007352     1 51.347  0.6052 0.4401612    
## Sin_SbstOrient   0.0006076 0.0006076     1 50.951  0.5002 0.4826455    
## Cos_SbstOrient   0.0037459 0.0037459     1 51.428  3.0837 0.0850367 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Highlighted below for comparison are the relevant results presented in Dunham et al. (2020a) Table 2 (p. 6):

While the F-values, degrees of freedom, and p-values are not exact matches, in both my replication and the original model relative speed is significantly correlated with mean number of supporting limbs, while sine of substrate orientation does not have a significant effect. However, my replication and the original model differ in that mine does not support a significant effect by substrate diameter and cosine of substrate orientation on mean number of supporting limbs while the original model did find these both to be significant.

Effect of substrate diameter and substrate orientation on mean number of supporting limbs for C. pygmaea

#For Cebuella, effect of substrate diameter and substrate orientation on mean number of supporting limbs
m3c <- lmer(MeanNSL ~ relSpeed + relBrnchDiamNsTl + Sin_SbstOrient + Cos_SbstOrient + (1|MovClip), data = cebuella, na.action = na.omit)
(anova(m3c))
## Type III Analysis of Variance Table with Satterthwaite's method
##                  Sum Sq Mean Sq NumDF  DenDF F value    Pr(>F)    
## relSpeed         3.9458  3.9458     1 91.107 60.7371 1.017e-11 ***
## relBrnchDiamNsTl 0.0007  0.0007     1 69.623  0.0105   0.91853    
## Sin_SbstOrient   0.4032  0.4032     1 54.555  6.2058   0.01581 *  
## Cos_SbstOrient   0.0060  0.0060     1 68.851  0.0920   0.76251    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Highlighted below for comparison are the relevant results presented in Dunham et al. (2020a) Table 2 (p. 6):

While the F-values, degrees of freedom, and p-values are not exact matches, in both my replication and the original model, relative speed is significantly correlated with mean number of supporting limbs, while branch diameter and cosine of substrate orientation each do not have a significant effect. However, my replication and the original model differ in that mine also shows a significant correlation between sine of substrate orientation and mean number of supporting limbs while the original model did not find this to be significant.

Effect of substrate diameter and substrate orientation on percent aerial phase for S. tripartitus

#For Saguinus, effect of substrate diameter and substrate orientation on percent aerial phase
m6s <- lmer(PercAerial ~ relSpeed + relBrnchDiamNsTl + Sin_SbstOrient + Cos_SbstOrient + (1|MovClip), data = saguinus, na.action = na.omit)
(anova(m6s))
## Type III Analysis of Variance Table with Satterthwaite's method
##                      Sum Sq    Mean Sq NumDF  DenDF F value   Pr(>F)   
## relSpeed         0.00078310 0.00078310     1 44.965 10.8422 0.001938 **
## relBrnchDiamNsTl 0.00033021 0.00033021     1 51.327  4.5719 0.037282 * 
## Sin_SbstOrient   0.00011170 0.00011170     1 50.938  1.5465 0.219343   
## Cos_SbstOrient   0.00000867 0.00000867     1 51.403  0.1201 0.730346   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Highlighted below for comparison are the relevant results presented in Dunham et al. (2020a) Table 2 (p. 6):

While the F-values, degrees of freedom, and p-values are not exact matches, in both my replication and the original model, substrate diameter is significantly correlated with percent aerial phase, while cosine and sine of substrate orientation did not have a significant effect. However, my replication and the original model differ in that mine also shows a significant correlation between relative speed and percent aerial phase while the original model did not find this to be significant.

Effect of substrate diameter and substrate orientation on percent aerial phase for C. pygmaea

#For Cebuella, effect of substrate diameter and substrate orientation on percent aerial phase
m6c <- lmer(PercAerial ~ relSpeed + relBrnchDiamNsTl + Sin_SbstOrient + Cos_SbstOrient + (1|MovClip), data = cebuella, na.action = na.omit)
(anova(m6c))
## Type III Analysis of Variance Table with Satterthwaite's method
##                   Sum Sq Mean Sq NumDF  DenDF F value   Pr(>F)    
## relSpeed         0.31976 0.31976     1 89.988 37.2361 2.59e-08 ***
## relBrnchDiamNsTl 0.00312 0.00312     1 73.917  0.3638 0.548239    
## Sin_SbstOrient   0.08636 0.08636     1 56.793 10.0571 0.002447 ** 
## Cos_SbstOrient   0.00999 0.00999     1 73.334  1.1634 0.284293    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Highlighted below for comparison are the relevant results presented in Dunham et al. (2020a) Table 2 (p. 6):

While the F-values, degrees of freedom, and p-values are not exact matches, in both my replication and the original model, relative speed and sine of substrate orientation are significantly correlated with percent aerial phase, while substrate diameter and cosine of substrate orientation did not have a significant effect.

Binary logistic mixed-effects models

Rather than the above method, Dunham et al. (2020a) used binary logistic mixed-effects models for two models because the residual error for these two models was not normally distributed, meaning that they did not “satisfy assumptions of linear mixed regression analysis” (p. 4). These two models were:

  • The effect of substrate diameter and substrate orientation on relative forelimb lead duration for C. pygmaea
  • The effect of substrate diameter and substrate orientation on relative hindlimb lead duration for C. pygmaea

Unfortunately, when running the following code while replicating the linear mixed models for [1] the effect of substrate diameter and substrate orientation on relative forelimb lead duration for S. tripartitus, and [2] the effect of substrate diameter and substrate orientation on relative hindlimb lead duration for S. tripartitus, the lmer() function repeatedly returned an error despite using the same syntax as for all other models:

[1] lmer(relFLldMS ~ relSpeed + relBrnchDiamNsTl + Sin_SbstOrient + Cos_SbstOrient + (1|MovClip), data = saguinus, na.action = na.omit)

[2] lmer(relHLldMS ~ relSpeed + relBrnchDiamNsTl + Sin_SbstOrient + Cos_SbstOrient + (1|MovClip), data = saguinus, na.action = na.omit)

Specifically, the error stated that the number of levels of each grouping factor must be < number of observations. Given that (1|MovClip) as a random effect did not return this error for any other model created using this method, I was initially unsure what the continued cause of this error is. I decided to look at the number of levels of the grouping factor (MovClip) and the number of observations for each respective response variable ([1] relFLldMS, and [2] relHLldMS) for S. tripartitus.

The number of levels of the grouping factor, MovClip, is:

(length(levels(saguinus$MovClip)))
## [1] 172

The number of usable observations of [1] relFLldMS for Saguinus tripartitus is:

(sum(!is.na(saguinus$relFLldMS)))
## [1] 97

The number of usable observations of [2] relHLldMS for Saguinus tripartitus is:

(sum(!is.na(saguinus$relHLldMS)))
## [1] 97

If my understanding of the error message is correct, it would seem that the number of dropped NAs for relFLldMS and relHLldMS for S. tripartitus may be responsible for the lmer() function not working for attempted models [1] and [2]. I am not certain of this because in total there were 108 observations for S. tripartitus (Dunham et al., 2020a; Dunham et al., 2020b), which is also less than the number of levels in MovClip, so I am unsure why this error would not arise in the case of the other models as well unless I am misinterpreting its meaning.

For attempted models [1] and [2], I ultimately decided to instead use binary logistic mixed-effects models instead of linear mixed models in case the issue was that the residual error for these two models was not normally distributed, as was the case for the Cebuella pygmaea models for these same response variables.

Effect of substrate diameter and substrate orientation on relative forelimb lead duration for S. tripartitus

m4s <- glm(formula = relFLldMS ~ relSpeed + relBrnchDiamNsTl + Sin_SbstOrient + Cos_SbstOrient,
       family = "binomial", data = saguinus)
(summary(m4s))
## 
## Call:
## glm(formula = relFLldMS ~ relSpeed + relBrnchDiamNsTl + Sin_SbstOrient + 
##     Cos_SbstOrient, family = "binomial", data = saguinus)
## 
## Coefficients:
##                  Estimate Std. Error z value Pr(>|z|)
## (Intercept)      -0.08933    1.57989  -0.057    0.955
## relSpeed         -0.12652    0.13948  -0.907    0.364
## relBrnchDiamNsTl -2.54274    3.10631  -0.819    0.413
## Sin_SbstOrient   -0.34138    0.90450  -0.377    0.706
## Cos_SbstOrient    0.18657    1.88343   0.099    0.921
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 8.3199  on 47  degrees of freedom
## Residual deviance: 4.8859  on 43  degrees of freedom
##   (60 observations deleted due to missingness)
## AIC: 44.493
## 
## Number of Fisher Scoring iterations: 7

Highlighted below for comparison are the relevant results presented in Dunham et al. (2020a) Table 2 (p. 6):

While the z-values, degrees of freedom, and p-values are not exact matches, in both my replication and the original model, cosine and sine of substrate orientation each were not significantly correlated with relative forelimb lead duration. However, my replication and the original model differ in that mine does not show a significant correlation between either relative speed or substrate diameter with relative forelimb limb duration while the original model did find these to be significant effects.

Effect of substrate diameter and substrate orientation on relative forelimb lead duration for C. pygmaea

m4c <- glm(formula = relFLldMS ~ relSpeed + relBrnchDiamNsTl + Sin_SbstOrient + Cos_SbstOrient,
       family = "binomial", data = cebuella)
(summary(m4c))
## 
## Call:
## glm(formula = relFLldMS ~ relSpeed + relBrnchDiamNsTl + Sin_SbstOrient + 
##     Cos_SbstOrient, family = "binomial", data = cebuella)
## 
## Coefficients:
##                  Estimate Std. Error z value Pr(>|z|)
## (Intercept)       -2.5466     2.3626  -1.078    0.281
## relSpeed          -0.2299     0.1566  -1.467    0.142
## relBrnchDiamNsTl  -0.4058     1.8535  -0.219    0.827
## Sin_SbstOrient    -1.5563     1.0255  -1.518    0.129
## Cos_SbstOrient     3.6009     3.5836   1.005    0.315
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 11.2583  on 100  degrees of freedom
## Residual deviance:  3.8306  on  96  degrees of freedom
## AIC: 16.239
## 
## Number of Fisher Scoring iterations: 8

Highlighted below for comparison are the relevant results presented in Dunham et al. (2020a) Table 2 (p. 6):

While the z-values, degrees of freedom, and p-values are not exact matches, in both my replication and the original model, there is no significant effect of relative speed, substrate diameter, cosine of substrate orientation, or sine of substrate orientation on relative forelimb lead duration.

Effect of substrate diameter and substrate orientation on relative hindlimb lead duration for S. tripartitus

m5s <- glm(formula = relHLldMS ~ relSpeed + relBrnchDiamNsTl + Sin_SbstOrient + Cos_SbstOrient,
       family = "binomial", data = saguinus)
(summary(m5s))
## 
## Call:
## glm(formula = relHLldMS ~ relSpeed + relBrnchDiamNsTl + Sin_SbstOrient + 
##     Cos_SbstOrient, family = "binomial", data = saguinus)
## 
## Coefficients:
##                  Estimate Std. Error z value Pr(>|z|)
## (Intercept)      -1.83157    2.03080  -0.902    0.367
## relSpeed         -0.18570    0.16768  -1.107    0.268
## relBrnchDiamNsTl -1.87143    3.33825  -0.561    0.575
## Sin_SbstOrient   -0.02364    1.02433  -0.023    0.982
## Cos_SbstOrient    1.96770    2.34108   0.841    0.401
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 7.8903  on 47  degrees of freedom
## Residual deviance: 4.3626  on 43  degrees of freedom
##   (60 observations deleted due to missingness)
## AIC: 30.66
## 
## Number of Fisher Scoring iterations: 8

Highlighted below for comparison are the relevant results presented in Dunham et al. (2020a) Table 2 (p. 6):

While the z-values, degrees of freedom, and p-values are not exact matches, in both my replication and the original model, sine of substrate orientation was not significantly correlated with relative hindlimb lead duration. However, my replication and the original model differ in that mine does not show a significant correlation of relative speed, substrate diameter, and cosine of substrate orientation with relative hindlimb limb duration while the original model did find these to be significant effects.

Effect of substrate diameter and substrate orientation on relative hindlimb lead duration for C. pygmaea

m5c <- glm(formula = relHLldMS ~ relSpeed + relBrnchDiamNsTl + Sin_SbstOrient + Cos_SbstOrient,
       family = "binomial", data = cebuella)
(summary(m5c))
## 
## Call:
## glm(formula = relHLldMS ~ relSpeed + relBrnchDiamNsTl + Sin_SbstOrient + 
##     Cos_SbstOrient, family = "binomial", data = cebuella)
## 
## Coefficients:
##                  Estimate Std. Error z value Pr(>|z|)
## (Intercept)       -1.0631     4.1786  -0.254    0.799
## relSpeed          -0.3912     0.3017  -1.297    0.195
## relBrnchDiamNsTl  -1.4242     4.2487  -0.335    0.737
## Sin_SbstOrient    -0.5052     1.7715  -0.285    0.775
## Cos_SbstOrient     1.3379     6.2821   0.213    0.831
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 7.3550  on 100  degrees of freedom
## Residual deviance: 1.8152  on  96  degrees of freedom
## AIC: 12.811
## 
## Number of Fisher Scoring iterations: 10

Highlighted below for comparison are the relevant results presented in Dunham et al. (2020a) Table 2 (p. 6):

While the z-values, degrees of freedom, and p-values are not exact matches, in both my replication and the original model, there is no significant effect of relative speed, substrate diameter, cosine of substrate orientation, or sine of substrate orientation on relative hindlimb lead duration.

Visualization

For my visualization, I will be attempting to replicate Dunham et al. (2020a) Figure 3 (p. 7), which consists of plots of various kinematic variables (mean duty factor, mean number of supporting limbs, relative forelimb lead duration, relative hindlimb lead duration, and percent aerial phase) regressed against relative speed for S. tripartitus and C. pygmaea.

First, I will create and print the plot for each kinematic variable independently for closer comparison with its Dunham et al. (2020a) Figure 3 counterpart.

Effect of relative speed on mean duty factor for S. tripartitus and C. pygmaea

p1 <- ggplot(data = d, aes(x = relSpeed, y = Meandf), color = Genus) +
  geom_point(aes(fill = Genus), alpha = 0.5) +
  geom_smooth(method="lm", formula = y ~ x, aes(color = Genus)) + #Adding regression lines
  scale_color_manual(values = c("Saguinus" = "gray70", "Cebuella" = "black")) + #Setting colors that match Figure 3
  scale_fill_manual(values = c("Saguinus" = "gray70", "Cebuella" = "black")) +
  labs(y = "Mean DF", x = element_blank()) + #Specifying axis labels
  ylim(0,1) + #Setting the y-axis range
  xlim(0,5) + #Setting the x-axis range
  theme(legend.background = element_rect(fill = NA), #Formatting the legend
        legend.title=element_blank(), 
        legend.position = c(0.1, 0.99), 
        legend.justification = c("left", "top"), 
        panel.background = element_blank(), #Formatting the background
        panel.border = element_rect(colour = "black", fill = NA, size = 1),
        panel.grid=element_line("gray92"))

(p1)

From Dunham et al. (2020a) Figure 3:

In my replication, the Cebuella pygmaea trend exhibits a higher mean duty factor than Saguinus tripartitus at any given relative speed. The regression slopes for both species look approximately equal. However, in the original figure, the regression line for S. tripartitus is much steeper than in my plot and intersects with the C. pygmaea regression line at around a relative speed of 3. My plot appears to be much less populated with points representing unique observations, likely due to the high quantity of dropped NAs from relSpeed and Meandf and/or observations that are missing from the published version of the dataset.

Effect of relative speed on mean number of supporting limbs for S. tripartitus and C. pygmaea

p2 <- ggplot(data = d, aes(x = relSpeed, y = MeanNSL), color = Genus) +
  geom_point(aes(fill = Genus), alpha = 0.5) + 
  geom_smooth(method = "lm", formula = y ~ x, aes(color = Genus)) + #Adding regression lines
  scale_color_manual(values = c("Saguinus" = "gray70", "Cebuella" = "black")) + #Setting colors that match Figure 3
  scale_fill_manual(values = c("Saguinus" = "gray70", "Cebuella" = "black")) +
  labs(y = "Mean NSL", x = element_blank()) + #Specifying axis labels
  ylim(0,4) + #Setting the y-axis range
  xlim(0,5) + #Setting the x-axis range
  theme(legend.position = "none", #Removing the legend
        panel.background = element_blank(), #Formatting the background
        panel.border = element_rect(colour = "black", fill = NA, size = 1),
        panel.grid=element_line("gray92"))

(p2)

From Dunham et al. (2020a) Figure 3:

In both my replication and the original figure, C. pygmaea appears to have a higher mean number of supporting limbs than S. tripartitus at any given relative speed and the regression slopes for both species look approximately equal. However, in the original figure, the regression lines for both species appear steeper than in my plot. Again, my plot appears to be much less populated with points representing observations, likely due to the high quantity of dropped NAs from relSpeed and MeanNSL and/or observations that are missing from the published version of the dataset.

Effect of relative speed on relative forelimb lead duration for S. tripartitus and C. pygmaea

p3 <- ggplot(data = d, aes(x = relSpeed, y = relFLldMS), color = Genus) +
  geom_point(aes(fill = Genus), alpha = 0.5) + 
  geom_smooth(method= "lm", formula = y ~ x, aes(color = Genus)) + #Adding regression lines
  scale_color_manual(values = c("Saguinus" = "gray50", "Cebuella" = "black")) + #Setting colors that match Figure 3
  scale_fill_manual(values = c("Saguinus" = "gray70", "Cebuella" = "black")) +
  labs(y = "Relative FLLD", x = element_blank()) + #Specifying axis labels
  ylim(-0.2,0.6) + #Setting the y-axis range
  xlim(0,5) + #Setting the x-axis range
  theme(legend.position = "none", #Removing the legend
        panel.background = element_blank(), #Formatting the background
        panel.border = element_rect(colour = "black", fill = NA, size = 1),
        panel.grid=element_line("gray92"))

(p3)

From Dunham et al. (2020a) Figure 3:

My replication and the original figure look starkly different. In my replication, the regression line for C. pygmaea shows a steep slope for the negative relationship between relative forelimb lead duration and relative speed, while the regression line for S. tripartitus shows a nearly horizontal slope, suggesting relative forelimb lead duration is fairly constant across relative speeds. This is exactly opposite the trends shown in the original figure. Again, my plot appears to be much less populated with points representing observations, likely due to the high quantity of dropped NAs from relSpeed and relFLldMS and/or observations that are missing from the published version of the dataset.

Effect of relative speed on relative hindlimb lead duration for S. tripartitus and C. pygmaea

p4 <- ggplot(data = d, aes(x = relSpeed, y = relHLldMS), color = Genus) +
  geom_point(aes(fill = Genus), alpha = 0.5) + 
  geom_smooth(method= "lm", formula = y ~ x, aes(color = Genus)) + #Adding regression lines
  scale_color_manual(values = c("Saguinus" = "gray50", "Cebuella" = "black")) + #Setting colors that match Figure 3
  scale_fill_manual(values = c("Saguinus" = "gray70", "Cebuella" = "black")) +
  labs(y = "Relative HLLD", x = "Relative speed") + #Specifying axis labels
  ylim(-0.2,0.6) + #Setting the y-axis range
  xlim(0,5) + #Setting the x-axis range
  theme(legend.position = "none", #Removing the legend
        panel.background = element_blank(), #Formatting the background
        panel.border = element_rect(colour = "black", fill = NA, size = 1),
        panel.grid=element_line("gray92"))

(p4)

From Dunham et al. (2020a) Figure 3:

As with the relFLldMS ~ relSpeed plot, my replication and the original figure look starkly different. In my replication, the regression line for C. pygmaea shows a steep slope for the negative relationship between relative hindlimb lead duration and relative speed, while the regression line ofr S. tripartitus shows a nearly horizontal slope, suggesting relative hindlimb lead duration is fairly constant across relative speeds. Same as with the relFLldMS ~ relSpeed plot, this is exactly opposite the trends shown in the original figure. Again, my plot appears to be much less populated with points representing observations, likely due to the high quantity of dropped NAs from relSpeed and relHLldMS and/or observations that are missing from the published version of the dataset.

Effect of relative speed on percent aerial phase for S. tripartitus and C. pygmaea

p5 <- ggplot(data = d, aes(x = relSpeed, y = PercAerial, color = Genus)) +
  geom_point(aes(fill = Genus), alpha = 0.5) + 
  geom_smooth(method= "lm", formula = y ~ x, aes(color = Genus)) + #Adding regression lines
  scale_color_manual(values = c("Saguinus" = "gray50", "Cebuella" = "black")) + #Setting colors that match Figure 3
  scale_fill_manual(values = c("Saguinus" = "gray70", "Cebuella" = "black")) +
  labs(caption = "Fig. 3. Variation in callitrichine asymmetrical gait kinematics (n=209 strides) in relation to relative speed", y = "% Aerial phase", x = "Relative speed") + #Specifying axis labels
  ylim(0,0.6) + #Setting the y-axis range
  xlim(0,5) + #Setting the x-axis range
  theme(legend.position = "none", #Removing the legend
        panel.background = element_blank(), #Formatting the background
        panel.border = element_rect(colour = "black", fill = NA, size = 1),
        panel.grid = element_line("gray92"),
        plot.caption = element_text(hjust=0.5, size=rel(0.5), face = "bold"))

(p5)

From Dunham et al. (2020a) Figure 3:

Out of all of the plots I attempted to replicate for Dunham et al. (2020a) Figure 3, this plot was arguably the least successful. In the original figure, both S. tripartitus and C. pygmaea show a positive relationship between relative speed and percent aerial phase, with the C. pygmaea trend exhibiting a higher percent aerial phase than S. tripartitus at any given relative speed. In contrast, the points on my plot are distributed very differently and the regression lines have very different slopes than those in the original figure. Specifically, while the C. pygmaea line shows a slight positive relationship, the S. tripartitus line has a nearly horizontal slope. As with all of the previous regression plots, my verison of the PercAerial ~ relSpeed plot appears to be much less populated with points representing observations, likely due to the high quantity of dropped NAs from relSpeed and PercAerial and/or observations that are missing from the published version of the dataset.

Full replication of Figure 3:

#Assembling Fig 3
p1andp2 <- plot_grid(p1, p2, ncol = 2, nrow = 1, align = "hv", rel_widths = c(1,1))
p3andp4 <- plot_grid(p3, p4, ncol = 2, nrow = 1, align = "hv", rel_widths = c(1,1))
p5andNULL <- plot_grid(p5, NULL, ncol = 2, nrow = 1, align = "hv", rel_widths = c(1,1))
#Printing Fig 3

Here is the Dunham et al. (2020a) full Figure 3 for comparison (p. 7):

Summary/Discussion

My descriptive statistics replication was relatively successful. The number of strides for different substrate orientation categories in free-ranging S. tripartitus and C. pygmaea in my table exactly matches that of Dunham et al. (2020a) Table 1. Still, the number of strides for different substrate diameter categories in my replication did not match Dunham et al. (2020a) Table 1. Despite following the definitions outlined by the author, this may be due to a discrepancy between how I assigned these substrate diameter categories and how they were assigned by Dunham et al. (2020a).

For the inferential statistics component of this assignment, my linear mixed models and binary logistic mixed-effects models were relatively successful in replicating the results summarized in Dunham et al. (2020a) Table 2. While the F-values (or Z-values for the binary mixed-effects models), degrees of freedom, and p-values were not exact matches, my replication found many of the same statistically significant correlations between a given kinematic variable and relative speed, substrate diameter, and/or substrate orientation. However, my replication was far from a perfect match, with various instances of a statistically significant effect being found where none had been in the original model, or a lack of statistical significance was found where a significant effect was found in the original model. These discrepancies may be due to data that was used by Dunham et al. (2020a) in their analyses but was not included in the published dataset (2020b).

Furthermore, because the raw angle measurement for substrate orientation is not a variable in the dataset, but the sine and cosine transformation outlined in the article are, I have made the assumption throughout this exercise that for all variables for which transformations were described, the values in the dataset have already undergone said transformations. I cannot confirm this due to the absence of separately specified columns of raw measurements and transformations of the variables in the published dataset.

Finally, for the visualization component of this assignment, my replication did not successfully match Dunham et al. (2020a) Figure 3. Across the plots of each kinematic variable regressed against relative speed, I noticed that my plots were far less populated with data points than in the original figure. Clearly, differences in my linear regression line compared to the original figure can be attributed to a difference in data points being used to fit this line. This difference in data points may have been due to the large number of NAs that were dropped from each variable. Potentially, some of these NA entries in the published dataset represent data that is missing from their dataset used by the authors in their analyses and visualizations. In addition to this, it is possible that the kinematic variables were transformed in some additional way by Dunham et al (2020a) that is not stated along with the transformations outlined in the article.

References

Dunham, N. T., McNamara, A., Shapiro, L. J., Phelps, T., & Young, J. W. (2019). Effects of substrate and phylogeny on quadrupedal gait in free-ranging platyrrhines. Am J Phys Anthropol, 170, 565–578. DOI:10.1002

Dunham, N. T., McNamara, A., Shapiro, L. J., Phelps, T., & Young, J. W. (2020a). Asymmetrical gait kinematics of free-ranging callitrichine primates in response to changes in substrate diameter and orientation. J Exp Biol 15, 223(12), jeb217562. DOI:10.1242

Dunham, N. T., McNamara, A., Shapiro, L. J., Phelps, T., Young, J. W. (2020b). Dunham et al. “Asymmetrical gait kinematics of free-ranging callitrichines in response to changes in substrate diameter and orientation”. figshare. Dataset. DOI:10.6084

Stack Overflow. (n.d.). Error in initializePtr function: CHOLMOD factor ldeta not provided by package. Retrieved from Stack Overflow